Abstract


In distributed systems, it is useful to classify persistent objects as either immutable
or  mutable.  The  contents  of  an  immutable  object  cannot  be  changed  while  the  contents 
of  a  mutable  object  can.  In  a  distributed  system,  multiple  copies  of  an  immutable 
object  can  exist  at  different  places  and  can  be  used  freely  without  the  need  for  any 
special  synchronization.  Mutable  objects,  however,  require  synchronization.  When 
an  object  is  about  to  be  changed,  all  current  users  need  to  be  notified.  When  two  users 
both  try  to  change  the  same  object,  only  one  should  be  permitted  to  succeed.  This  kind 
of  synchronization  requires  that  for  each  mutable  object  there  be  a  single  point  in  the 
network  that  controls  use  of  that  object.  If  a  network  becomes  temporarily  partitioned 
into  two  isolated  subnetworks,  only  one  will  have  control  over  each  mutable  object. 

This  paper  considers  a  class  of  objects  called  incrementally  mutable  objects  that 
are intermediate between mutable and immutable objects. Intuitively, the only
permitted modifications to an incrementally mutable object are those that add new
information to the object while preserving existing information. Changes to incrementally
mutable  objects  do  not  require  central  synchronization.  When  a  network  becomes 
partitioned,  the  same  incrementally  mutable  object  can  be  safely  modified  in  each 
subnetwork.  A  mutable  object  can  be  modeled  by  a  set  of  immutable  objects  that 
represent  each  value  of  the  object  over  time  and  an  incrementally  mutable  object  that 
relates each immutable object to its successor. Multiple successors are permitted to
represent parallel changes.

1  Introduction


The  demand  for  software  is  steadily  increasing  both  in  the  number  of  systems  being 
built  and  in  the  complexity  of  these  systems.  Unfortunately,  demand  for  software 
has  been  consistently  growing  faster  than  our  ability  to  produce  it.  Software  is  often 
delivered  late,  costs  much  more  than  originally  predicted,  and  fails  to  satisfactorily 
do  the  job  for  which  it  was  built.  The  combination  of  these  problems  has  been 
characterized  as  the  software  crisis  and  has  led  to  increasing  emphasis  on  the  field 
of  software  engineering  whose  goals  are  directed  at  solving  these  problems. 

Programming  environments  have  become  a  focal  point  for  much  of  the  work 
directed  toward  improving  the  practice  of  software  engineering.  Environments  are 
increasingly being based on multiple distributed machines connected with both local
and wide area networks. The management of persistent data is a central issue for
environments. Environments are being called on to be the repository of all information,
both technical and managerial, created throughout the lifecycle of a software system
from requirements through deployment and continued enhancement. Most current
programming  environments  support  persistent  data  by  using  a  file  system  with  one  or 
more  ad  hoc  databases.  Many  future  environments  will  be  based  on  object-oriented 
databases  that  combine  methods  used  in  traditional  databases  with  the  programming 
language  concepts  of  objects  and  abstraction  [1].  These  new  environments  will  differ 
in  several  important  ways  from  traditional  database  systems  [2,3]. 

This  paper  focuses  on  one  important  aspect  of  future  programming  environments: 
how to manage evolution of data in a distributed system. First, evolution, existing
methods for providing it, and the weaknesses of these methods are considered.
Second, some motivations for a new approach are discussed. Third, a new approach,
incremental  mutability,  that  eliminates  these  weaknesses  is  introduced.  Finally,  two 
examples  of  incremental  mutability  are  presented. 

2  Evolution 

The  information  in  a  software  environment  evolves  over  time.  Not  only  are  new 
objects  added,  but  existing  objects  must  also  evolve. 

The  simplest  mechanism  for  evolution  is  to  permit  all  objects  to  be  changed. 
However,  changing  an  object  presents  special  problems  when  multiple  users  are 
involved,  particularly  when  the  environment  is  on  a  distributed  system.  When  two 
users are changing the same object, the changes of one may overwrite the changes
of the other or, worse, the resulting object may contain some combination of the partial
changes  of  each  user.  Many  environments  provide  a  locking  mechanism  so  that  only 
one  user  can  change  an  object  at  any  time.  This  approach  has  two  problems.  First, 
while  one  user  is  changing  an  object,  all  others  who  need  to  change  that  object  are 
locked  out  and  must  wait  until  the  object  is  unlocked  before  they  can  proceed  with 
their work. Since objects can be locked for long times, a loss of productivity may
occur. Second, the locking mechanism requires that control of an object reside at a
single point within a distributed network. This can be seen by considering two users
at  different  points  in  the  network  both  trying  to  change  the  same  object.  Suppose 
that  the  network  is  partitioned  by  breaking  all  paths  between  the  two  users.  Now,  at 
most  one  user  will  be  able  to  change  the  object,  because  control  of  the  object  resides 
in  the  part  of  the  network  where  changes  to  the  object  can  be  made.  The  place  where 
control of an object resides will be called the control point for that object. A control
point for an object resides at some single point within a distributed system and limits
access to operations of the object. Although control points on a single centralized
machine  have  proved  useful  and  effective,  they  create  problems  in  distributed  systems 
that  are  discussed  in  the  next  section. 

If  historical  information  is  to  be  preserved,  evolution  cannot  be  done  by  simply 
changing  the  existing  object.  Instead,  a  version  scheme  is  needed  that  records  a 
sequence  of  objects,  each  of  which  represents  the  state  of  some  changeable  object 
at  a  different  point  in  time.  Instead  of  changing  an  object,  a  new  version  of  the 
object  is  created  in  which  all  changes  have  been  made.  Once  created,  a  version 
cannot  be  changed.  This  ensures  that  historical  information  is  preserved.  Versions 
need  not  be  totally  ordered.  When  alternatives  occur,  such  as  when  a  bug  is  fixed 
in  an  old  release  while  work  continues  on  the  next  release,  the  sequence  can  fork. 
When  alternatives  come  together,  separate  sequences  can  join.  Abstractly,  a  directed 
acyclic  version  graph  is  formed.  Not  all  points  in  the  version  graph  are  equally 
important;  in  practice,  users  impose  additional  structure  at  one  or  more  levels  of 
granularity  and  do  not  preserve  versions  below  some  minimum  level  of  granularity. 
The  finest  granularity  corresponds  to  every  edit.  A  coarse  granularity  corresponds 
to  major  release  points.  Intermediate  granularities  are  frequently  defined  to  aid  the 
management  of  a  development  project.  Two  common  methods  for  supporting  versions 
are  discussed  below:  naming  schemes  and  version  control  programs. 

Source  versions  are  often  handled  by  conventions  for  naming  directories  and 
files.  One  primitive  approach  divides  the  world  into  three  groups  of  directories: 
old,  current,  and  new.  Most  users  will  use  objects  in  current.  Versions  under 
development  and  experimental  versions  reside  in  new.  When  a  new  object  is  stable, 
the  current  version  of  that  object  is  moved  to  old  and  the  new  version  is  moved 
to  current.  This  approach  has  two  disadvantages.  First,  only  three  versions  of  an 
object  are  kept.  In  practice,  many  more  than  three  are  needed.  Second,  the  current 
state  of  the  system  is  constantly  changing.  The  behavior  of  any  uses  of  current 
objects  can  change  in  unexpected  ways  without  any  notification.  The  user  is  never 
sure  that  what  worked  yesterday  will  work  the  same  today. 

A more general approach is to use names V1, V2, V3, and so forth. Although this
approach can be extended to represent version graphs with forks and joins, naming
can get complex. Name qualification can occur either at the directory or the file level.
For example, if "/" separates directory names and "." can appear as a character
within file names, two versions of the object, foo.x, could be represented either by
directory names such as

/foo_project/source1/foo.x
/foo_project/source2/foo.x

or by file names such as

/foo_project/source/foo.x.1
/foo_project/source/foo.x.2

When  directory  names  are  used  and  one  object  in  that  directory  is  changed,  then  all 
other  objects  in  the  old  directory  must  be  copied  unchanged  into  the  new  directory 
resulting in multiple identical files that represent a single logical object.¹ When file
names  are  used,  the  consistency  relationships  between  versions  of  different  objects 
are  no  longer  explicit  and  must  be  separately  maintained.  Another  disadvantage  of 
naming  schemes  is  that  command  scripts  need  to  be  aware  of  the  version  naming 
conventions. For example, a command script that references a V1 object will have
to be edited before it can be used for a V2 object.²

Several  programs  have  been  developed  for  version  control  including  the  Unix 
SCCS  tool,  a  similar  but  improved  Unix  tool  RCS  [4],  and  most  recently  the  Apollo 
DSEE system [5]. These tools support version histories of individual objects including
those  whose  version  graphs  have  both  forks  and  joins.  All  versions  of  an  object  are 
stored  in  a  single  physical  file  using  delta  encoding  to  save  space.  In  SCCS  and  RCS, 
before  any  particular  version  of  an  object  can  be  used,  a  copy  of  it  must  be  explicitly 
extracted  from  the  physical  file.  After  the  copy  is  changed  it  must  be  explicitly 
copied  back  as  a  new  version  into  the  physical  file.  This  approach  not  only  requires 
the  user  to  do  these  extra  explicit  operations,  but  also  places  the  burden  on  the  user  of 
maintaining  the  logical  relationship  between  extracted  copies  and  the  master  version. 
In  DSEE,  the  user  specifies  a  configuration  that  lists  specific  versions  of  each  object 
that  the  user  wants  to  see.  At  this  point  transparent  access  to  those  specific  versions 
is  provided.  Since  logically  no  copy  occurs,  consistency  is  automatically  maintained. 

When  a  single  version  has  multiple  successors,  all  these  tools  designate  some 
single  version  as  the  primary  successor.  Starting  with  the  first  version  of  the  object 
and  following  the  path  via  primary  successors  will  end  at  a  version  that  is  designated 
as the current version. Users can request either a specific named version or,
alternatively, the current version. Requesting the current version, however,
has exactly the same problem of unexpected changes as the old-current-new naming
scheme. Since there is only one primary successor, only one user can create it.³
Control  over  which  user  can  create  the  primary  successor  therefore  must  rest  with  a 
control  point  with  all  the  resulting  disadvantages  discussed  below. 
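
The walk to the current version described above can be sketched in a few lines. This is only an illustration and not the mechanism used by SCCS, RCS, or DSEE: the `primary_successor` mapping and the version names are hypothetical.

```python
def current_version(first, primary_successor):
    """Follow primary successors from the first version to the current one.

    `primary_successor` is a hypothetical mapping from a version name to the
    single version designated as its primary successor (absent if none).
    """
    version = first
    while version in primary_successor:
        version = primary_successor[version]
    return version


# Hypothetical history: 1.1 -> 1.2 -> 1.3. A side branch off 1.2 would not
# appear in this mapping, so the walk never reaches it.
history = {"1.1": "1.2", "1.2": "1.3"}
print(current_version("1.1", history))
```

Because the mapping holds exactly one successor per version, whoever updates it decides what every later request for the current version will see, which is why this design needs a control point.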

¹Many systems provide links that can be used to avoid this copying, but only at some cost in structural
complexity.

²The edit can in some systems be avoided by passing the version as a string parameter that is then
inserted into the right place in the file name.

³In SCCS, it is even worse since only one user can be changing any version of an object at the same
time. This completely inhibits parallel development.

3  Motivation


In the next section, a new approach to evolution called incremental mutability is
proposed. Incremental mutability is particularly well suited to distributed environments.
Three  aspects  of  distribution  motivate  this  approach:  the  problems  of  centralized 
control  points,  the  advantages  of  immutable  objects,  and  the  ways  that  groups  of 
people  manually  synchronize  their  work. 

As  discussed  above,  a  control  point  for  an  object  is  some  single  location  in  the 
network  that  centralizes  control  of  the  operations  that  can  be  done  simultaneously  on 
that object. Two examples of control points were discussed in the previous section:
the  lock  that  prevents  multiple  users  from  simultaneously  changing  the  same  object 
and  the  control  that  determines  which  user  can  create  the  primary  successor  of  some 
single  version.  The  use  of  control  points  in  distributed  systems  has  several  problems: 

•  Increased  network  traffic.  When  the  user  of  an  object  and  the  control  point 
for  the  object  are  at  different  points  in  a  network,  messages  must  be  sent 
between  the  user  and  the  control  point  for  each  user  operation.  Consider  as  an 
example  a  conventional  tree  structured  file  system,  such  as  the  Sun  Network 
File  System  [6],  supported  across  a  distributed  system.  Any  user  file  creation 
or  deletion  requires  interaction  with  the  control  point  for  the  directory  in  which 
that file resides. Such operations can occur at a very high rate [7].

•  Increased  user  delays.  The  round  trip  time  for  messages  communicating  with 
the  control  point  can  result  in  annoyingly  slow  response  to  user  commands. 
This  is  particularly  a  problem  when  low  speed  links  are  involved  such  as 
modems  over  telephone  lines  or  when  the  network  is  so  large  and  complex 
that  the  path  between  the  user  and  control  point  involves  many  intermediate 
machines. 

• Lock out. When none of the network paths between the user of an object and
the control point for that object is working, the user is completely locked out
until  some  path  again  becomes  available. 

In  one  way  or  another  all  the  previously  discussed  evolution  schemes  required  a 
control  point.  A  goal  of  the  approach  proposed  below  is  to  eliminate  the  need  for 
control  points. 

A second motivation is the advantages of immutable objects in a distributed
system. An immutable object is simply one whose value cannot be changed. Immutable
objects  are  the  obvious  way  to  capture  history.  Most  version  management  schemes 
treat  previous  versions  as  immutable.  In  a  distributed  system,  identical  copies  of 
each  immutable  object  can  exist  at  different  places  within  the  system.  This  approach 
can  ideally  be  regarded  as  an  implementation  strategy  where  there  is  a  single  abstract 
object  with  a  replicated  implementation.  Any  of  the  copies  of  the  object  can  be  used 
by itself without reference to the other copies or a centralized control point. Network
traffic  can  be  reduced  by  placing  copies  of  immutable  objects  where  they  are  likely 
to  be  frequently  accessed.  By  having  a  copy  close  at  hand,  no  network  delays  will 
occur during use. Finally, if a network should become partitioned by hardware
failure, then users in each part can use the same immutable object providing each has a
copy.  As  long  as  all  objects  are  immutable,  no  control  points  are  needed.  However, 
if  work  is  to  progress,  at  least  some  change  must  take  place.  One  approach  would 
be  to  structure  a  system  as  a  mix  of  both  mutable  and  immutable  objects.  Although 
such  an  approach  is  a  significant  improvement  over  a  system  in  which  all  objects  can 
be  changed,  control  points  are  still  needed.  A  second  goal  of  the  approach  proposed 
below  is  to  gain  the  advantages  of  immutability  while  still  permitting  change  so  that 
work  can  progress. 

To understand how to minimize the need for control points, it is instructive to
consider  how  multiple  users  working  on  the  same  system  interact  when  using  a 
programming  environment  that  provides  no  synchronization  for  object  modification. 
Here,  the  users  are  forced  to  invent  manual  methods  for  synchronization.  Other  than 
failures that occur when someone forgets the state of a manually set lock, such
methods work well. An important distinguishing characteristic of these manual methods is
the  frequency  of  the  synchronization  operations.  While  automated  approaches  often 
operate  with  a  frequency  of  many  synchronization  operations  per  second,  manual 
methods  may  have  a  frequency  of  only  a  few  operations  per  day.  While  automated 
systems often do some kind of synchronization at every change, manual
synchronization occurs only at relatively rare events such as when a software system is released.
By  implementing  analogues  of  these  manual  methods,  control  point  interactions  can 
be  decreased.  A  third  goal  of  the  approach  discussed  below  is  to  support  automated 
methods  that  synchronize  only  at  major  events  much  like  informal  manual  methods. 

4  Incremental  Mutability 

Incremental  mutability  is  a  new  approach  to  evolution  that  eliminates  control  points, 
offers  many  of  the  advantages  of  immutability,  and  more  closely  models  informal 
user  interactions.  In  this  approach,  a  class  of  objects  intermediate  between  mutable 
and  immutable  objects  is  introduced.  For  a  mutable  object  any  change  is  permitted, 
while for an immutable object no changes are allowed. For an incrementally mutable
object (IMO), the only permitted modifications are those that add new information to
the  object  while  preserving  existing  information. 

Formally,  we  assume  each  IMO  has  a  type  and  that  each  type  defines  a  fixed 
set  of  permitted  operations.  There  are  three  kinds  of  operations:  create  operations 
initially create an IMO, use operations return information from an object, and change
operations change the value of the object. For specification purposes, use and change
operations  take  the  IMO  as  their  initial  parameter.  Create  and  change  operations 
return  the  IMO  as  a  result.  For  change  operations  the  initial  parameter  is  the  value 
of  the  IMO  before  the  change  and  the  result  is  the  value  of  that  same  IMO  after  the 
change.  Each  of  these  kinds  of  operations  can  optionally  have  additional  parameters. 

Formally,  IMO's  have  two  defining  properties: 

• Monotonicity. Invocations of a use operation are classified as either stable or
unstable. For specification purposes, a predicate stable can be applied to any
use operation invocation and return true if the result of that invocation will
never change. Monotonicity requires that all uses of an object that are stable
remain stable and produce the same result after the object is changed. An
object is monotonic if:

For any value $X$, use operation $U$, and change operation $C$ of the object,
and lists of values $v_1$ and $v_2$:

$$\text{if } \mathit{stable}(U(X, v_1)) \text{ then } \mathit{stable}(U(C(X, v_2), v_1)) \land \bigl(U(X, v_1) = U(C(X, v_2), v_1)\bigr)$$

• Commutativity. Invocations of a change operation are classified as either legal
or illegal. For specification purposes, a predicate legal can be applied to any
change operation invocation. When a change operation invocation is illegal
then an error condition occurs and the object is not changed. A sequence of
changes is legal if all changes in the sequence are legal. Commutativity
requires that for any two arbitrary legal sequences of changes that could be made
to an object, then applying all of the first followed by all of the second will be
legal and produce the same result as applying all of the second followed by all
of the first. An object is commutative if:

For any value of the object $X$ and change operations $C_1 \ldots C_n$,
where each $C_i$ can be any of the change operations of the object,
and lists of values $v_1 \ldots v_n$:

Let $Q_1$ be $\lambda X.\, C_1(X, v_1)$
$\vdots$
Let $Q_n$ be $\lambda X.\, C_n(X, v_n)$
Let $S_1$ be $Q_1 \circ Q_2 \circ \ldots \circ Q_m$
Let $S_2$ be $Q_{m+1} \circ Q_{m+2} \circ \ldots \circ Q_n$
Let $S_{12}$ be $S_1 \circ S_2$
Let $S_{21}$ be $S_2 \circ S_1$

$$\text{if } \mathit{legal}(S_1(X)) \land \mathit{legal}(S_2(X)) \text{ then } \mathit{legal}(S_{12}(X)) \land \mathit{legal}(S_{21}(X)) \land S_{12}(X) = S_{21}(X)$$
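
As a rough illustration of the two properties, they can be checked by brute force on a small model. The sketch below assumes a grow-only set as the object (the set type of Section 5); the function names `change_insert`, `use_is_member`, `is_monotonic`, and `is_commutative` are hypothetical, and the exhaustive check over a tiny universe is a testing aid, not a proof.

```python
def change_insert(x, e):
    """Change operation C(X, v): add e while preserving existing members."""
    return x | {e}


def use_is_member(x, e):
    """Use operation U(X, v): membership test."""
    return e in x


def stable(x, e):
    """stable(U(X, v)): a membership result can never change once it is true."""
    return use_is_member(x, e)


def is_monotonic(universe):
    """Check: stable(U(X,v1)) implies stable(U(C(X,v2),v1)) with equal results."""
    values = [frozenset(), frozenset(universe[:1]), frozenset(universe)]
    for x in values:
        for v1 in universe:
            if not stable(x, v1):
                continue  # the property only constrains stable uses
            for v2 in universe:
                y = change_insert(x, v2)
                if not stable(y, v1):
                    return False
                if use_is_member(x, v1) != use_is_member(y, v1):
                    return False
    return True


def apply_all(x, changes):
    """Apply a sequence of insert arguments to x, in order."""
    for e in changes:
        x = change_insert(x, e)
    return x


def is_commutative(x, s1, s2):
    """Check that applying s1 then s2 gives the same value as s2 then s1."""
    return apply_all(apply_all(x, s1), s2) == apply_all(apply_all(x, s2), s1)


print(is_monotonic(["a", "b", "c"]))
print(is_commutative(frozenset(), ["a", "b"], ["c", "a"]))
```

Every insert is legal in this model, so the legality half of the commutativity condition holds trivially; a type with illegal changes would need that check as well.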

IMO’s can be used to support re-creation in programming environments.
Re-creation is the ability to go back to an old version of a software product and repeat
all the steps that were involved in manufacturing it [8]. Manufacturing takes
primitives such as source modules and produces products such as executable programs by
performing a set of manufacturing steps. All inputs to each manufacturing step,
including the program used to perform the step, either must be a primitive or the result
of some previous step. The partially ordered set of manufacturing steps is captured
by a derivation graph. Re-creation of a product is possible if all primitive objects
and the object that holds the derivation graph are either immutable or incrementally
mutable and if all operations on those objects used during manufacturing are stable.
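
One way to picture re-creation is to replay a derivation graph in dependency order. The sketch below is an assumption-laden illustration, not the mechanism of [8]: the `recreate` function, the dictionary layout, and the use of plain Python functions as stand-ins for the programs that perform each step are all hypothetical.

```python
from graphlib import TopologicalSorter


def recreate(primitives, steps):
    """Replay manufacturing steps in an order compatible with the derivation graph.

    primitives: name -> value of each primitive object (assumed immutable).
    steps: product name -> (tool, list of input names); `tool` is a Python
           function standing in for the program that performs the step.
    """
    # Each product depends on its inputs; primitives have no dependencies.
    graph = {name: set(inputs) for name, (_tool, inputs) in steps.items()}
    results = dict(primitives)
    for name in TopologicalSorter(graph).static_order():
        if name in results:
            continue  # a primitive, nothing to manufacture
        if name not in steps:
            raise KeyError(f"{name} is neither a primitive nor a step result")
        tool, inputs = steps[name]
        results[name] = tool(*(results[i] for i in inputs))
    return results


# Hypothetical example: two sources are "compiled", then "linked".
sources = {"a.c": "A", "b.c": "B"}
derivation = {
    "a.o": (str.lower, ["a.c"]),
    "b.o": (str.lower, ["b.c"]),
    "prog": (lambda x, y: x + y, ["a.o", "b.o"]),
}
print(recreate(sources, derivation)["prog"])
```

Running `recreate` twice on the same primitives and derivation graph yields the same products only if nothing it reads can change underneath it, which is the condition stated above.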

IMO’s  have  an  attractive  implementation  in  a  distributed  network.  First,  like 
immutable  objects,  multiple  copies  of  an  IMO  can  be  placed  at  different  points  within 
the  network.  When  an  IMO  is  changed,  the  local  copy  of  that  object  is  changed 
immediately  and  messages  requesting  the  change  are  sent  to  all  remote  copies.  The 
local  user  need  not  wait  for  those  messages  to  arrive  before  proceeding  to  further 
use  and  modify  the  IMO.  This  implementation  allows  progress  to  be  made  even  in 
the  presence  of  long  network  delays  and  failures  of  network  links.  If  a  network 
becomes  partitioned  then  any  user  who  has  access  to  any  copy  of  a  given  IMO  can 
use and change it. Any messages being sent to a place to which all network links
are  currently  down  are  queued  until  some  connection  is  again  available.  If  an  IMO 
is  not  changed  again  until  all  messages  are  received  and  processed,  all  copies  of  an 
IMO  will  converge  to  the  same  value. 
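
The replication scheme just described can be sketched with two copies of a grow-only set. This is a simplified model under stated assumptions: the `Replica` class, its `link_up` table, and in-process delivery (calling the peer directly instead of sending a network message) are all hypothetical.

```python
from collections import deque


class Replica:
    """One copy of a grow-only set IMO at some place in the network (sketch)."""

    def __init__(self, name):
        self.name = name
        self.value = set()
        self.peers = []        # other Replica objects holding a copy
        self.outbox = deque()  # (peer, element) messages not yet delivered
        self.link_up = {}      # peer name -> whether the link is working

    def connect(self, other):
        self.peers.append(other)
        self.link_up[other.name] = True

    def insert(self, element):
        """Change the local copy at once, then queue messages to remote copies."""
        self.value.add(element)
        for peer in self.peers:
            self.outbox.append((peer, element))
        self.flush()

    def flush(self):
        """Deliver queued messages whose link is up; keep the rest queued."""
        pending = deque()
        while self.outbox:
            peer, element = self.outbox.popleft()
            if self.link_up[peer.name]:
                peer.value.add(element)  # the remote copy applies the change
            else:
                pending.append((peer, element))
        self.outbox = pending


def demo():
    a, b = Replica("a"), Replica("b")
    a.connect(b)
    b.connect(a)
    a.link_up["b"] = b.link_up["a"] = False  # the network is partitioned
    a.insert("bug 1")                        # both sides keep working
    b.insert("bug 2")
    during = (set(a.value), set(b.value))
    a.link_up["b"] = b.link_up["a"] = True   # the partition heals
    a.flush()
    b.flush()
    return during, a.value, b.value


print(demo())
```

During the partition each copy holds only its own insertion; once the queued messages are delivered, both copies hold the same two elements regardless of the order in which the insertions arrived.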

The  operations  that  change  an  IMO  can  be  considered  to  be  events.  On  a  network 
basis, events are only partially ordered. At any given place within the network, events
will  be  seen  as  totally  ordered  in  a  way  that  is  compatible  with  the  partial  order.  The 
total  order  seen  at  different  places  will,  in  general,  be  different.  Another  way  of 
viewing  the  partial  order  of  events  is  to  consider  time  to  be  relativistic  [9].  In 
relativistic  time,  there  is  no  system-wide  absolute  clock.  Each  machine  within  the 
network  is  assumed  to  have  its  own  clock  that  progresses  at  its  own  rate.  Control 
points  can  be  thought  of  as  a  way  of  establishing  a  system  wide  total  ordering  of 
events.  IMO’s  permit  events  to  occur  in  different  orders  on  different  nodes,  thus 
eliminating  the  need  for  control  points. 

5  Example  1:  Set 

In  this  section  a  simple  example  of  an  IMO  type  is  presented,  the  set  type.  Set  objects 
can  be  used  in  a  programming  environment  for  several  purposes: 

•  Bug  Report  Set.  Here  each  element  is  a  string  that  describes  some  bug  found 
in  a  particular  source  module. 

•  Distribution  List.  Here  the  set  elements  represent  people  who  have  been  sent 
a copy of some document.

•  Property  Set.  Here  the  elements  are  references  to  other  objects.  A  property 
set  object  contains  references  to  immutable  objects  that  all  satisfy  some  specific 
property.⁴

IMO  sets  are  formally  described  below  by  specifying  a  type  for  internal  state 
and  a  set  of  operations.  For  each  operation,  algebraic  rules  give  the  semantics  of 
the  operation.  Rules  are  also  included  to  specify  when  use  operations  are  stable  and 
when  change  operations  are  legal. 

⁴The multidirectory MU discussed in [10] could be implemented using property sets.


• Type. IMO sets can have elements of any type T.

set of T

• Create Operation: create_set_of_T.

The create operation creates an empty set.

Form:

create_set_of_T() => set of T

Rules:

create_set_of_T() = ∅

• Use Operation: is_member.

This operation tests for set membership. Since members cannot be removed,
is_member is stable when its result is true. Since members can later be added,
is_member is unstable when its result is false.

Form:

is_member(s:set of T,e:T) => boolean

Rules:

is_member(s,e) = e ∈ s

stable(is_member(s,e)) = is_member(s,e)

• Change Operation: insert.

This operation adds a new member to the set if it was not already present.

Form:

insert(s:set of T,e:T) => set of T

Rules:

insert(s,e) = s ∪ { e }

legal(insert(s,e)) = true
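
The specification above translates almost directly into code. The following is a minimal sketch, assuming an in-memory Python class; the name `IMOSet` and the method `is_member_stable` are hypothetical, and the class mutates itself in place where the specification returns a new value.

```python
class IMOSet:
    """Grow-only set following the set-of-T specification (illustrative sketch)."""

    def __init__(self):
        # create_set_of_T() = empty set
        self._members = set()

    def is_member(self, e):
        # is_member(s,e) = e in s
        return e in self._members

    def is_member_stable(self, e):
        # stable(is_member(s,e)) = is_member(s,e): a true answer can never
        # change, while a false answer may change after a later insert.
        return self.is_member(e)

    def insert(self, e):
        # insert(s,e) = s union {e}; legal(insert(s,e)) = true, so no error
        # path is needed. Returning self mirrors "=> set of T".
        self._members.add(e)
        return self


bugs = IMOSet()
bugs.insert("off-by-one in parser").insert("crash on empty input")
print(bugs.is_member("crash on empty input"))
print(bugs.is_member_stable("not reported yet"))
```

The class deliberately has no remove operation: offering one would make a true membership result unstable and break monotonicity.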

6  Example  2:  Version  Graphs 

Mutable  objects  can  be  modeled  by  a  set  of  immutable  objects  that  represent  each 
value  of  the  object  over  time  and  an  IMO  relation  object  that  relates  each  immutable 
object  to  its  successor.  This  IMO  relation  is,  in  effect,  an  encoding  of  a  version  graph. 
Multiple  successors  can  be  used  to  represent  parallel  changes.  Multiple  predecessors 
can  be  used  to  represent  merged  development  paths. 

The  type  and  operations  for  version  IMO’s  are  given  below. 

• Type. Version graphs are encoded by specifying the initial version and the links
between versions. Specific versions are represented by separate immutable
objects. Each of these objects will have a unique identifier, UID, that distinguishes
it from all other objects. The UID’s are then used in version graph objects to
serve as object references.

vgraph=record{initial:UID,next:set of record{old:UID,new:UID}}

• Create Operation: create_vgraph.

This operation creates a vgraph object initialized to have only a single initial
version.

Form:

create_vgraph(init:UID) => vgraph

Rules:

create_vgraph(init).initial = init
create_vgraph(init).next = ∅

• Use Operation: initial.

This operation returns the initial version.

Form:

initial(d:vgraph) => UID

Rules:

initial(d) = d.initial
stable(initial(d)) = true

• Use Operation: in_vgraph.

This operation returns true if some specified version is in a vgraph. Like the
set is_member operation, it is stable when its result is true.

Form:

in_vgraph(d:vgraph,x:UID) => boolean

Rules:

in_vgraph(d,x) = (x = d.initial) ∨ (∃ y, <y,x> ∈ d.next)
stable(in_vgraph(d,x)) = in_vgraph(d,x)

• Use Operation: predecessors.

This operation returns all the predecessors of a given version. The add
operation defined below guarantees that all predecessors of a version are specified
at the time the version is entered into the vgraph and that no predecessors can
later be added.

Form:

predecessors(d:vgraph,x:UID) => set of UID

Rules:

predecessors(d,x) = {y | <y,x> ∈ d.next}
stable(predecessors(d,x)) = in_vgraph(d,x)


•  Use  Operation:  successors. 

This  operation  returns  all  the  successors  of  a  given  version.  Since  successors 
can  always  be  later  added,  this  operation  is  unstable. 

Form: 

successors(d:vgraph,x:UID) => set of UID

Rules:

successors(d,x) = {y | <x,y> ∈ d.next}
stable(successors(d,x)) = false

•  Change  Operation:  add. 

This operation adds a new version to a vgraph. All predecessors are specified
and  must  already  be  in  the  vgraph.  A  value  for  the  new  version  is  passed  to 
the  add  operation  which  returns  the  UID  of  a  new  object  with  that  value.  The 
add  operation  must  have  two  results,  the  new  vgraph  and  the  UID  of  the  new 
object.  This  is  achieved  by  returning  a  record  with  two  components,  one  for 
each  result. 

Form: 

add(d:vgraph,old:set of UID,v:value) => record{d:vgraph,new:UID}

Rules:

add(d,old,v) = <<d.initial, d.next ∪ {<o,new> | o ∈ old}>, new>
where new is the UID of a new object created by add
whose value is v

legal(add(d,old,v)) = (∀ o ∈ old, in_vgraph(d,o)) ∧ old ≠ ∅
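
The version graph operations can likewise be sketched in code. This is an illustration under assumptions: the `VGraph` dataclass, the `objects` store for version values, the `legal_add` helper, and the use of `uuid4` to generate UIDs are all hypothetical choices, and the graph is updated in place where the specification returns a new record.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class VGraph:
    initial: str                                 # UID of the initial version
    next: set = field(default_factory=set)       # set of (old, new) UID pairs
    objects: dict = field(default_factory=dict)  # hypothetical store: UID -> value


def create_vgraph(init):
    return VGraph(initial=init)


def in_vgraph(d, x):
    # True for the initial version or any version that has a predecessor link.
    return x == d.initial or any(new == x for _old, new in d.next)


def predecessors(d, x):
    return {old for old, new in d.next if new == x}


def successors(d, x):
    return {new for old, new in d.next if old == x}


def legal_add(d, old):
    # legal(add(d,old,v)): old is non-empty and every member is in the graph.
    return len(old) > 0 and all(in_vgraph(d, o) for o in old)


def add(d, old, v):
    """Add a new version with value v whose predecessors are the UIDs in old."""
    if not legal_add(d, old):
        raise ValueError("illegal add: predecessors missing or empty")
    new = uuid.uuid4().hex                       # UID of the new immutable object
    d.objects[new] = v
    d.next |= {(o, new) for o in old}
    return d, new


def demo():
    d = create_vgraph("v0")
    d, fix = add(d, {"v0"}, "bug fix on the old release")
    d, dev = add(d, {"v0"}, "work on the next release")  # parallel successor
    d, merged = add(d, {fix, dev}, "merged result")      # join of both paths
    return successors(d, "v0") == {fix, dev}, predecessors(d, merged) == {fix, dev}


print(demo())
```

Two users can each call `add` with the same predecessor without coordinating, because each call only contributes new links; neither needs to be chosen as the single primary successor.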

7  Conclusions 

Current  methods  for  providing  evolution  of  data  in  a  programming  environment  were 
shown  to  have  disadvantages  when  the  environment  runs  on  a  distributed  system. 
A  new  approach,  incremental  mutability,  provides  evolution  but  has  none  of  these 
disadvantages. 